Processing definite descriptions in corpora
Abstract
We discuss in this paper a system that resolves definite descriptions in written texts. A preliminary study of definite descriptions in a collection of 20 texts revealed that about 30% of the 1040 definites in the collection were cases of anaphoric definites whose antecedents had the same head noun, and 50% introduced novel discourse referents. An algorithm which resolves anaphoric definite descriptions and also identifies novel ones is proposed. We evaluated the algorithm by comparing its results with an annotation produced by human subjects. The analysis of the corpus, the implemented algorithm, and the evaluation of the results are presented in this paper.
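The same-head case mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes noun phrases are already extracted as token lists in text order, treats the last token as the head noun, and treats any phrase starting with "the" as a definite description. The function names (`head_noun`, `is_definite`, `classify_definites`) are hypothetical.

```python
from typing import List, Optional, Tuple


def head_noun(np_tokens: List[str]) -> str:
    # Assumption: the head noun is the last token of the noun phrase.
    return np_tokens[-1].lower()


def is_definite(np_tokens: List[str]) -> bool:
    # Assumption: a definite description starts with the article "the".
    return bool(np_tokens) and np_tokens[0].lower() == "the"


def classify_definites(
    noun_phrases: List[List[str]],
) -> List[Tuple[int, Optional[int]]]:
    """Pair each definite description with its nearest same-head antecedent.

    Returns (index, antecedent_index) tuples; antecedent_index is None
    when no earlier noun phrase shares the head, i.e. the description is
    treated as introducing a novel discourse referent.
    """
    results: List[Tuple[int, Optional[int]]] = []
    for i, np in enumerate(noun_phrases):
        if not is_definite(np):
            continue
        head = head_noun(np)
        antecedent: Optional[int] = None
        # Scan backwards so the most recent matching phrase wins.
        for j in range(i - 1, -1, -1):
            candidate = noun_phrases[j]
            if candidate and head_noun(candidate) == head:
                antecedent = j
                break
        results.append((i, antecedent))
    return results


if __name__ == "__main__":
    phrases = [["a", "house"], ["the", "old", "house"], ["the", "kitchen"]]
    print(classify_definites(phrases))
```

Here "the old house" is linked to "a house" because the heads match, while "the kitchen" has no same-head antecedent and is left unresolved. Descriptions like the latter would need separate handling as either novel or bridging.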
Similar resources
Corpus-based Development and Evaluation of a System for Processing Definite Descriptions
We present an implemented system for processing definite descriptions. The system is based on the results of a corpus analysis previously reported, which showed how common discourse-new descriptions are in newspaper corpora, and identified several problems to be dealt with when developing computational methods for interpreting bridging descriptions. The annotated corpus produced in this earlier...
An Empirically-based System for Processing Definite Descriptions
We present an implemented system for processing definite descriptions in arbitrary domains. The design of the system is based on the results of a corpus analysis previously reported, which highlighted the prevalence of discourse-new descriptions in newspaper corpora. The annotated corpus was used to extensively evaluate the proposed techniques for matching definite descriptions with their antec...
Multilingual Corpora Annotation for Processing Definite Descriptions
This paper presents a multilingual corpus study aimed at verifying whether heuristics developed for coreference resolution in English texts also apply to Portuguese and French.
Nominal Expressions in Multilingual Corpora: Definites and Demonstratives
This paper presents the results of a multilingual corpus study on definite descriptions and demonstrative noun phrases. The analysis of a parallel corpus (French and Portuguese) reinforces previous findings regarding the predominance of non-anaphoric uses of definite descriptions in English corpora. It is also shown that the use of demonstrative noun phrases, on the other hand, is more regu...
The or That: Definite and Demonstrative Descriptions in Second Language Acquisition
Since Heubner's (1985) pioneering study, there have been many studies on the (mis)use and non-use of articles by L2 learners from article-less and article languages. The present study investigated how Persian L2 learners of English produce and interpret English definite descriptions and demonstrative descriptions. It was assumed that definite and demonstrative descriptions share the same central sema...